Overview

Dataset statistics

Number of variables: 10
Number of observations: 226537
Missing cells: 455568
Missing cells (%): 20.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 17.3 MiB
Average record size in memory: 80.0 B
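
The derived figures in this table can be reproduced from the raw counts. The sketch below is only an arithmetic check; it assumes every column is stored as an 8-byte value (int64 or float64), which the report does not state but which matches the 80.0 B record size.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 10
n_observations = 226_537
missing_cells = 455_568

# Share of all cells (rows x columns) that are missing.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: each value occupies 8 bytes (int64 or float64).
bytes_per_value = 8
record_size = n_variables * bytes_per_value
total_mib = record_size * n_observations / 2**20

print(f"missing cells: {missing_pct:.1f}%")
print(f"record size:   {record_size} B")
print(f"total memory:  {total_mib:.1f} MiB")
```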

Variable types

Numeric: 10

Alerts

wime_komfort is highly correlated with df_index and 8 other fields (High correlation)
wime_sauberkeit is highly correlated with wime_personal and 6 other fields (High correlation)
wime_platzangebot is highly correlated with wime_personal and 6 other fields (High correlation)
wime_gesamtzuf is highly correlated with wime_personal and 7 other fields (High correlation)
wime_preis_leistung is highly correlated with df_index and 6 other fields (High correlation)
wime_personal is highly correlated with wime_komfort and 7 other fields (High correlation)
wime_puenktlich is highly correlated with wime_personal and 4 other fields (High correlation)
wime_fahrplan is highly correlated with wime_personal and 6 other fields (High correlation)
df_index is highly correlated with wime_komfort and 1 other field (High correlation)
wime_oes_fahrt is highly correlated with wime_personal and 3 other fields (High correlation)
wime_personal has 149074 (65.8%) missing values (Missing)
wime_komfort has 50395 (22.2%) missing values (Missing)
wime_sauberkeit has 47232 (20.8%) missing values (Missing)
wime_puenktlich has 46621 (20.6%) missing values (Missing)
wime_platzangebot has 45836 (20.2%) missing values (Missing)
wime_gesamtzuf has 38474 (17.0%) missing values (Missing)
wime_preis_leistung has 15301 (6.8%) missing values (Missing)
wime_fahrplan has 8190 (3.6%) missing values (Missing)
wime_oes_fahrt has 54445 (24.0%) missing values (Missing)
df_index has unique values (Unique)
wime_komfort has 2918 (1.3%) zeros (Zeros)
wime_puenktlich has 4598 (2.0%) zeros (Zeros)
wime_platzangebot has 6262 (2.8%) zeros (Zeros)
wime_preis_leistung has 6364 (2.8%) zeros (Zeros)
wime_fahrplan has 5237 (2.3%) zeros (Zeros)
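
As a consistency check, the per-column missing counts in the alerts should add up to the dataset-level "Missing cells" figure. The snippet below is a small sketch of that check; the counts are copied from the alerts, and df_index is left out because it has no missing values.

```python
n_observations = 226_537

# Missing counts per column, copied from the "Missing" alerts.
missing_by_column = {
    "wime_personal": 149_074,
    "wime_komfort": 50_395,
    "wime_sauberkeit": 47_232,
    "wime_puenktlich": 46_621,
    "wime_platzangebot": 45_836,
    "wime_gesamtzuf": 38_474,
    "wime_preis_leistung": 15_301,
    "wime_fahrplan": 8_190,
    "wime_oes_fahrt": 54_445,
}

total_missing = sum(missing_by_column.values())
missing_pct = {
    name: 100 * count / n_observations
    for name, count in missing_by_column.items()
}

print(total_missing)
for name, pct in missing_pct.items():
    print(f"{name}: {pct:.1f}%")
```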

Reproduction

Analysis started: 2022-11-22 17:30:56.545845
Analysis finished: 2022-11-22 17:31:22.646434
Duration: 26.1 seconds
Software version: pandas-profiling v3.4.0
Download configuration: config.json
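
A report of this kind is typically produced with a few lines of code. The following is a minimal sketch, not the configuration actually used here: the helper name `build_report`, the output path, and the title are hypothetical, and the settings stored in config.json are not reproduced.

```python
def build_report(df, output_path="report.html", title="Profiling report"):
    """Write an HTML profiling report for `df` (a pandas DataFrame)."""
    # Imported lazily so this module still loads where pandas-profiling
    # is not installed.
    from pandas_profiling import ProfileReport

    profile = ProfileReport(df, title=title)
    profile.to_file(output_path)
    return output_path
```

Calling `build_report(df)` on the survey DataFrame would write the HTML file and return its path.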

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 226537
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 113441.2574
Minimum: 0
Maximum: 229488
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 11499.8
Q1: 56807
Median: 113441
Q3: 170075
95-th percentile: 215382.2
Maximum: 229488
Range: 229488
Interquartile range (IQR): 113268

Descriptive statistics

Standard deviation: 65397.92223
Coefficient of variation (CV): 0.576491514
Kurtosis: -1.199839849
Mean: 113441.2574
Median Absolute Deviation (MAD): 56634
Skewness: 2.43676751 × 10⁻⁵
Sum: 2.569864213 × 10¹⁰
Variance: 4276888232
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
226335 | 1 | < 0.1%
75813 | 1 | < 0.1%
75844 | 1 | < 0.1%
75846 | 1 | < 0.1%
75827 | 1 | < 0.1%
75826 | 1 | < 0.1%
75825 | 1 | < 0.1%
75824 | 1 | < 0.1%
75808 | 1 | < 0.1%
75809 | 1 | < 0.1%
Other values (226527) | 226527 | > 99.9%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%

Value | Count | Frequency (%)
229488 | 1 | < 0.1%
229345 | 1 | < 0.1%
229287 | 1 | < 0.1%
228803 | 1 | < 0.1%
228738 | 1 | < 0.1%
228717 | 1 | < 0.1%
228687 | 1 | < 0.1%
228498 | 1 | < 0.1%
228420 | 1 | < 0.1%
228400 | 1 | < 0.1%
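
Several of the df_index statistics are simple functions of one another. The sketch below recomputes them from the reported values using the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 − Q1); it checks the figures for consistency and is not the profiler's own code.

```python
import math

# Values copied from the df_index statistics.
mean = 113441.2574
std = 65397.92223
q1, q3 = 56807, 170075
minimum, maximum = 0, 229488

cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance is the squared standard deviation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum

print(f"CV       = {cv:.9f}")
print(f"Variance = {variance:.0f}")
print(f"IQR      = {iqr}")
print(f"Range    = {value_range}")
```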

wime_personal
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 13
Distinct (%): < 0.1%
Missing: 149074
Missing (%): 65.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 89.87376769
Minimum: 0
Maximum: 100
Zeros: 688
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 50
Q1: 77.77777778
Median: 100
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 22.22222222

Descriptive statistics

Standard deviation: 17.83286506
Coefficient of variation (CV): 0.198421247
Kurtosis: 6.721953352
Mean: 89.87376769
Median Absolute Deviation (MAD): 0
Skewness: -2.361162011
Sum: 6961891.667
Variance: 318.0110763
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=13)
Value | Count | Frequency (%)
100 | 50819 | 22.4%
75 | 9400 | 4.1%
88.88888889 | 5367 | 2.4%
77.77777778 | 4839 | 2.1%
66.66666667 | 1877 | 0.8%
50 | 1711 | 0.8%
44.44444444 | 885 | 0.4%
55.55555556 | 817 | 0.4%
0 | 688 | 0.3%
25 | 431 | 0.2%
Other values (3) | 629 | 0.3%
(Missing) | 149074 | 65.8%

Value | Count | Frequency (%)
0 | 688 | 0.3%
11.11111111 | 138 | 0.1%
22.22222222 | 227 | 0.1%
25 | 431 | 0.2%
33.33333333 | 264 | 0.1%
44.44444444 | 885 | 0.4%
50 | 1711 | 0.8%
55.55555556 | 817 | 0.4%
66.66666667 | 1877 | 0.8%
75 | 9400 | 4.1%

Value | Count | Frequency (%)
100 | 50819 | 22.4%
88.88888889 | 5367 | 2.4%
77.77777778 | 4839 | 2.1%
75 | 9400 | 4.1%
66.66666667 | 1877 | 0.8%
55.55555556 | 817 | 0.4%
50 | 1711 | 0.8%
44.44444444 | 885 | 0.4%
33.33333333 | 264 | 0.1%
25 | 431 | 0.2%
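
The 13 distinct values seen in the wime_* columns are exactly the multiples of 25 and of 100/9 between 0 and 100. One reading consistent with this, offered as an assumption rather than something the report states, is that the ratings come from a 5-step and a 10-step scale that were both rescaled to 0-100. The sketch below builds that value set and counts it.

```python
from fractions import Fraction

# Assumption: two rating scales, both rescaled to the range 0..100.
five_step = {Fraction(100 * k, 4) for k in range(5)}    # 0, 25, 50, 75, 100
ten_step = {Fraction(100 * k, 9) for k in range(10)}    # 0, 100/9, ..., 100

# Only 0 and 100 are shared, so the union has 5 + 10 - 2 values.
scale = sorted(five_step | ten_step)
rounded = [round(float(v), 8) for v in scale]

print(len(scale))
print(rounded)
```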

wime_komfort
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 50395
Missing (%): 22.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.91358803
Minimum: 0
Maximum: 100
Zeros: 2918
Zeros (%): 1.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 33.33333333
Q1: 75
Median: 77.77777778
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 22.84215857
Coefficient of variation (CV): 0.2894578632
Kurtosis: 1.473248838
Mean: 78.91358803
Median Absolute Deviation (MAD): 22.22222222
Skewness: -1.241285089
Sum: 13899997.22
Variance: 521.7642082
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=13)
Value | Count | Frequency (%)
100 | 67219 | 29.7%
75 | 36387 | 16.1%
77.77777778 | 17438 | 7.7%
88.88888889 | 12623 | 5.6%
66.66666667 | 10817 | 4.8%
50 | 10106 | 4.5%
55.55555556 | 5996 | 2.6%
44.44444444 | 4845 | 2.1%
0 | 2918 | 1.3%
25 | 2750 | 1.2%
Other values (3) | 5043 | 2.2%
(Missing) | 50395 | 22.2%

Value | Count | Frequency (%)
0 | 2918 | 1.3%
11.11111111 | 1002 | 0.4%
22.22222222 | 1685 | 0.7%
25 | 2750 | 1.2%
33.33333333 | 2356 | 1.0%
44.44444444 | 4845 | 2.1%
50 | 10106 | 4.5%
55.55555556 | 5996 | 2.6%
66.66666667 | 10817 | 4.8%
75 | 36387 | 16.1%

Value | Count | Frequency (%)
100 | 67219 | 29.7%
88.88888889 | 12623 | 5.6%
77.77777778 | 17438 | 7.7%
75 | 36387 | 16.1%
66.66666667 | 10817 | 4.8%
55.55555556 | 5996 | 2.6%
50 | 10106 | 4.5%
44.44444444 | 4845 | 2.1%
33.33333333 | 2356 | 1.0%
25 | 2750 | 1.2%

wime_sauberkeit
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 13
Distinct (%): < 0.1%
Missing: 47232
Missing (%): 20.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 79.30538902
Minimum: 0
Maximum: 100
Zeros: 1583
Zeros (%): 0.7%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 44.44444444
Q1: 75
Median: 77.77777778
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 21.44728534
Coefficient of variation (CV): 0.2704391922
Kurtosis: 1.142929158
Mean: 79.30538902
Median Absolute Deviation (MAD): 22.22222222
Skewness: -1.102926376
Sum: 14219852.78
Variance: 459.9860485
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=13)
Value | Count | Frequency (%)
100 | 66268 | 29.3%
75 | 39181 | 17.3%
77.77777778 | 18011 | 8.0%
88.88888889 | 13642 | 6.0%
50 | 12495 | 5.5%
66.66666667 | 10775 | 4.8%
55.55555556 | 5581 | 2.5%
44.44444444 | 4508 | 2.0%
25 | 3170 | 1.4%
33.33333333 | 2167 | 1.0%
Other values (3) | 3507 | 1.5%
(Missing) | 47232 | 20.8%

Value | Count | Frequency (%)
0 | 1583 | 0.7%
11.11111111 | 606 | 0.3%
22.22222222 | 1318 | 0.6%
25 | 3170 | 1.4%
33.33333333 | 2167 | 1.0%
44.44444444 | 4508 | 2.0%
50 | 12495 | 5.5%
55.55555556 | 5581 | 2.5%
66.66666667 | 10775 | 4.8%
75 | 39181 | 17.3%

Value | Count | Frequency (%)
100 | 66268 | 29.3%
88.88888889 | 13642 | 6.0%
77.77777778 | 18011 | 8.0%
75 | 39181 | 17.3%
66.66666667 | 10775 | 4.8%
55.55555556 | 5581 | 2.5%
50 | 12495 | 5.5%
44.44444444 | 4508 | 2.0%
33.33333333 | 2167 | 1.0%
25 | 3170 | 1.4%

wime_puenktlich
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 46621
Missing (%): 20.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 88.91320579
Minimum: 0
Maximum: 100
Zeros: 4598
Zeros (%): 2.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 33.33333333
Q1: 88.88888889
Median: 100
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 11.11111111

Descriptive statistics

Standard deviation: 22.12768183
Coefficient of variation (CV): 0.2488683389
Kurtosis: 6.159402156
Mean: 88.91320579
Median Absolute Deviation (MAD): 0
Skewness: -2.509552242
Sum: 15996908.33
Variance: 489.6343031
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=13)
Value | Count | Frequency (%)
100 | 124288 | 54.9%
75 | 17208 | 7.6%
88.88888889 | 11702 | 5.2%
77.77777778 | 7321 | 3.2%
0 | 4598 | 2.0%
50 | 4093 | 1.8%
66.66666667 | 2952 | 1.3%
25 | 2093 | 0.9%
44.44444444 | 1566 | 0.7%
55.55555556 | 1485 | 0.7%
Other values (3) | 2610 | 1.2%
(Missing) | 46621 | 20.6%

Value | Count | Frequency (%)
0 | 4598 | 2.0%
11.11111111 | 629 | 0.3%
22.22222222 | 988 | 0.4%
25 | 2093 | 0.9%
33.33333333 | 993 | 0.4%
44.44444444 | 1566 | 0.7%
50 | 4093 | 1.8%
55.55555556 | 1485 | 0.7%
66.66666667 | 2952 | 1.3%
75 | 17208 | 7.6%

Value | Count | Frequency (%)
100 | 124288 | 54.9%
88.88888889 | 11702 | 5.2%
77.77777778 | 7321 | 3.2%
75 | 17208 | 7.6%
66.66666667 | 2952 | 1.3%
55.55555556 | 1485 | 0.7%
50 | 4093 | 1.8%
44.44444444 | 1566 | 0.7%
33.33333333 | 993 | 0.4%
25 | 2093 | 0.9%

wime_platzangebot
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 45836
Missing (%): 20.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 80.26031646
Minimum: 0
Maximum: 100
Zeros: 6262
Zeros (%): 2.8%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 22.22222222
Q1: 75
Median: 100
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 26.7383851
Coefficient of variation (CV): 0.3331457722
Kurtosis: 1.385339091
Mean: 80.26031646
Median Absolute Deviation (MAD): 0
Skewness: -1.454391955
Sum: 14503119.44
Variance: 714.941238
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=13)
Value | Count | Frequency (%)
100 | 91346 | 40.3%
75 | 25485 | 11.2%
77.77777778 | 12081 | 5.3%
88.88888889 | 10341 | 4.6%
50 | 9523 | 4.2%
66.66666667 | 6876 | 3.0%
0 | 6262 | 2.8%
25 | 4450 | 2.0%
55.55555556 | 4096 | 1.8%
44.44444444 | 3937 | 1.7%
Other values (3) | 6304 | 2.8%
(Missing) | 45836 | 20.2%

Value | Count | Frequency (%)
0 | 6262 | 2.8%
11.11111111 | 1549 | 0.7%
22.22222222 | 2316 | 1.0%
25 | 4450 | 2.0%
33.33333333 | 2439 | 1.1%
44.44444444 | 3937 | 1.7%
50 | 9523 | 4.2%
55.55555556 | 4096 | 1.8%
66.66666667 | 6876 | 3.0%
75 | 25485 | 11.2%

Value | Count | Frequency (%)
100 | 91346 | 40.3%
88.88888889 | 10341 | 4.6%
77.77777778 | 12081 | 5.3%
75 | 25485 | 11.2%
66.66666667 | 6876 | 3.0%
55.55555556 | 4096 | 1.8%
50 | 9523 | 4.2%
44.44444444 | 3937 | 1.7%
33.33333333 | 2439 | 1.1%
25 | 4450 | 2.0%

wime_gesamtzuf
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 13
Distinct (%): < 0.1%
Missing: 38474
Missing (%): 17.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 84.58338725
Minimum: 0
Maximum: 100
Zeros: 2040
Zeros (%): 0.9%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 50
Q1: 75
Median: 88.88888889
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 19.58225893
Coefficient of variation (CV): 0.2315142438
Kurtosis: 3.750400158
Mean: 84.58338725
Median Absolute Deviation (MAD): 11.11111111
Skewness: -1.725912029
Sum: 15907005.56
Variance: 383.4648649
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=13)
Value | Count | Frequency (%)
100 | 87829 | 38.8%
75 | 37915 | 16.7%
88.88888889 | 20359 | 9.0%
77.77777778 | 16801 | 7.4%
50 | 7048 | 3.1%
66.66666667 | 6768 | 3.0%
55.55555556 | 2801 | 1.2%
44.44444444 | 2196 | 1.0%
0 | 2040 | 0.9%
25 | 1869 | 0.8%
Other values (3) | 2437 | 1.1%
(Missing) | 38474 | 17.0%

Value | Count | Frequency (%)
0 | 2040 | 0.9%
11.11111111 | 475 | 0.2%
22.22222222 | 915 | 0.4%
25 | 1869 | 0.8%
33.33333333 | 1047 | 0.5%
44.44444444 | 2196 | 1.0%
50 | 7048 | 3.1%
55.55555556 | 2801 | 1.2%
66.66666667 | 6768 | 3.0%
75 | 37915 | 16.7%

Value | Count | Frequency (%)
100 | 87829 | 38.8%
88.88888889 | 20359 | 9.0%
77.77777778 | 16801 | 7.4%
75 | 37915 | 16.7%
66.66666667 | 6768 | 3.0%
55.55555556 | 2801 | 1.2%
50 | 7048 | 3.1%
44.44444444 | 2196 | 1.0%
33.33333333 | 1047 | 0.5%
25 | 1869 | 0.8%

wime_preis_leistung
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 15301
Missing (%): 6.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 73.89085352
Minimum: 0
Maximum: 100
Zeros: 6364
Zeros (%): 2.8%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 25
Q1: 50
Median: 75
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 50

Descriptive statistics

Standard deviation: 26.43170844
Coefficient of variation (CV): 0.3577128586
Kurtosis: 0.2255697471
Mean: 73.89085352
Median Absolute Deviation (MAD): 25
Skewness: -0.9196123172
Sum: 15608408.33
Variance: 698.6352108
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=13)
Value | Count | Frequency (%)
100 | 75551 | 33.4%
75 | 45071 | 19.9%
50 | 25270 | 11.2%
77.77777778 | 13293 | 5.9%
66.66666667 | 10076 | 4.4%
25 | 8402 | 3.7%
88.88888889 | 7804 | 3.4%
0 | 6364 | 2.8%
55.55555556 | 6284 | 2.8%
44.44444444 | 6253 | 2.8%
Other values (3) | 6868 | 3.0%
(Missing) | 15301 | 6.8%

Value | Count | Frequency (%)
0 | 6364 | 2.8%
11.11111111 | 1219 | 0.5%
22.22222222 | 2588 | 1.1%
25 | 8402 | 3.7%
33.33333333 | 3061 | 1.4%
44.44444444 | 6253 | 2.8%
50 | 25270 | 11.2%
55.55555556 | 6284 | 2.8%
66.66666667 | 10076 | 4.4%
75 | 45071 | 19.9%

Value | Count | Frequency (%)
100 | 75551 | 33.4%
88.88888889 | 7804 | 3.4%
77.77777778 | 13293 | 5.9%
75 | 45071 | 19.9%
66.66666667 | 10076 | 4.4%
55.55555556 | 6284 | 2.8%
50 | 25270 | 11.2%
44.44444444 | 6253 | 2.8%
33.33333333 | 3061 | 1.4%
25 | 8402 | 3.7%

wime_fahrplan
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 8190
Missing (%): 3.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 83.35784834
Minimum: 0
Maximum: 100
Zeros: 5237
Zeros (%): 2.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 25
Q1: 75
Median: 100
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 23.89661238
Coefficient of variation (CV): 0.2866750145
Kurtosis: 2.525530506
Mean: 83.35784834
Median Absolute Deviation (MAD): 0
Skewness: -1.682809537
Sum: 18200936.11
Variance: 571.0480832
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=13)
Value | Count | Frequency (%)
100 | 118187 | 52.2%
75 | 32842 | 14.5%
77.77777778 | 14678 | 6.5%
88.88888889 | 12666 | 5.6%
50 | 10663 | 4.7%
66.66666667 | 7748 | 3.4%
0 | 5237 | 2.3%
25 | 4365 | 1.9%
55.55555556 | 3956 | 1.7%
44.44444444 | 3745 | 1.7%
Other values (3) | 4260 | 1.9%
(Missing) | 8190 | 3.6%

Value | Count | Frequency (%)
0 | 5237 | 2.3%
11.11111111 | 840 | 0.4%
22.22222222 | 1509 | 0.7%
25 | 4365 | 1.9%
33.33333333 | 1911 | 0.8%
44.44444444 | 3745 | 1.7%
50 | 10663 | 4.7%
55.55555556 | 3956 | 1.7%
66.66666667 | 7748 | 3.4%
75 | 32842 | 14.5%

Value | Count | Frequency (%)
100 | 118187 | 52.2%
88.88888889 | 12666 | 5.6%
77.77777778 | 14678 | 6.5%
75 | 32842 | 14.5%
66.66666667 | 7748 | 3.4%
55.55555556 | 3956 | 1.7%
50 | 10663 | 4.7%
44.44444444 | 3745 | 1.7%
33.33333333 | 1911 | 0.8%
25 | 4365 | 1.9%

wime_oes_fahrt
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 13
Distinct (%): < 0.1%
Missing: 54445
Missing (%): 24.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 90.68931476
Minimum: 0
Maximum: 100
Zeros: 405
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 66.66666667
Q1: 77.77777778
Median: 100
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 22.22222222

Descriptive statistics

Standard deviation: 14.74616729
Coefficient of variation (CV): 0.162600934
Kurtosis: 5.606273125
Mean: 90.68931476
Median Absolute Deviation (MAD): 0
Skewness: -2.017445543
Sum: 15606905.56
Variance: 217.4494496
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=13)
Value | Count | Frequency (%)
100 | 107734 | 47.6%
75 | 23036 | 10.2%
88.88888889 | 16983 | 7.5%
77.77777778 | 12813 | 5.7%
66.66666667 | 4225 | 1.9%
50 | 3252 | 1.4%
55.55555556 | 1485 | 0.7%
44.44444444 | 943 | 0.4%
25 | 586 | 0.3%
0 | 405 | 0.2%
Other values (3) | 630 | 0.3%
(Missing) | 54445 | 24.0%

Value | Count | Frequency (%)
0 | 405 | 0.2%
11.11111111 | 106 | < 0.1%
22.22222222 | 210 | 0.1%
25 | 586 | 0.3%
33.33333333 | 314 | 0.1%
44.44444444 | 943 | 0.4%
50 | 3252 | 1.4%
55.55555556 | 1485 | 0.7%
66.66666667 | 4225 | 1.9%
75 | 23036 | 10.2%

Value | Count | Frequency (%)
100 | 107734 | 47.6%
88.88888889 | 16983 | 7.5%
77.77777778 | 12813 | 5.7%
75 | 23036 | 10.2%
66.66666667 | 4225 | 1.9%
55.55555556 | 1485 | 0.7%
50 | 3252 | 1.4%
44.44444444 | 943 | 0.4%
33.33333333 | 314 | 0.1%
25 | 586 | 0.3%

Interactions

[Figures: 100 pairwise interaction plots of the 10 variables]

Correlations

Auto

The auto setting is an easily interpretable pairwise column metric that selects a method according to the variable types of each pair: categorical-categorical pairs use Cramér's V, numerical-categorical pairs use Cramér's V (with the numerical column discretized), and numerical-numerical pairs use Spearman's ρ. This configuration uses the most suitable method for each pair of columns.

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
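
The definition above translates directly into code: rank both variables, then apply the Pearson formula to the ranks. The following is a minimal pure-Python sketch for illustration; the function names are hypothetical, ties receive the average of their positions, and this is not the profiler's implementation.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]   # monotonic but nonlinear in x
print(spearman(x, y), pearson(x, y))
```

For the cubic relationship in the example, ρ reaches 1 because the ranks agree perfectly, while r stays below 1 because the relationship is not linear.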

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
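
The formula and the invariance property can both be checked with a few lines of code. This is a small illustrative sketch; the data and the shift and scale constants are made up, and the scale factors are kept positive so the sign of r is unchanged.

```python
def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/n factors in the covariance and the variances cancel out.
    return cov / (var_x * var_y) ** 0.5


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r_original = pearson(x, y)

# Shift and rescale each variable separately (positive scale factors).
x_shifted = [10 + 3 * v for v in x]
y_shifted = [-5 + 0.5 * v for v in y]
r_transformed = pearson(x_shifted, y_shifted)

print(r_original, r_transformed)
```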

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
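
The pair-counting definition above can be sketched as follows. Note the assumption: this version divides by the total number of pairs exactly as described (often called τ-a) and simply counts tied pairs as neither concordant nor discordant; statistical libraries frequently default to a tie-adjusted variant (τ-b), so results can differ on data with many ties such as these ratings.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # The sign of the product tells whether x and y move in the same direction.
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # product == 0 means a tie in x or y: counted as neither.
    return (concordant - discordant) / len(pairs)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)   # 8 concordant and 2 discordant pairs out of 10
print(tau)
```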

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal, and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
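
Nullity correlation can be read as the ordinary Pearson correlation between the "is present" indicators of two columns. The sketch below illustrates that idea on made-up data; the function name and the toy columns are hypothetical, and `None` stands in for a missing value.

```python
def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        # A column that is always (or never) missing has no defined correlation.
        return float("nan")
    return cov / (var_a * var_b) ** 0.5


# Toy columns: the second is present wherever the first is, plus one extra row.
col_a = [None, 1, None, 2, 3, None]
col_b = [None, 5, None, 6, 7, 8]
r_null = nullity_correlation(col_a, col_b)
print(r_null)
```

A value near 1 means the two columns tend to be missing together; a value near -1 means one tends to be present when the other is missing.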

Sample

First rows

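(row) | df_index | wime_personal | wime_komfort | wime_sauberkeit | wime_puenktlich | wime_platzangebot | wime_gesamtzuf | wime_preis_leistung | wime_fahrplan | wime_oes_fahrt
0 | 226335 | NaN | 25.0 | 75.0 | 75.0 | 100.0 | 100.0 | 100.0 | 100.0 | 75.0
1 | 226400 | 75.0 | 75.0 | 50.0 | 100.0 | 50.0 | 75.0 | 50.0 | 75.0 | 100.0
2 | 226141 | NaN | NaN | NaN | NaN | NaN | NaN | 100.0 | 100.0 | NaN
3 | 226748 | NaN | 25.0 | 75.0 | 75.0 | 75.0 | 75.0 | 75.0 | 75.0 | 100.0
4 | 226415 | NaN | 100.0 | 50.0 | 100.0 | 75.0 | 100.0 | 50.0 | 50.0 | 50.0
5 | 226408 | 75.0 | 75.0 | 75.0 | 100.0 | 75.0 | 75.0 | 25.0 | 50.0 | 100.0
6 | 226754 | 100.0 | 75.0 | 75.0 | 100.0 | 75.0 | 100.0 | 100.0 | 100.0 | 100.0
7 | 226404 | NaN | 75.0 | 75.0 | 50.0 | 100.0 | 100.0 | 75.0 | 75.0 | 75.0
8 | 226403 | 100.0 | 100.0 | 75.0 | NaN | 100.0 | 75.0 | 100.0 | 75.0 | NaN
9 | 226760 | 75.0 | 100.0 | 100.0 | 100.0 | 100.0 | 75.0 | 75.0 | 75.0 | 100.0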

Last rows

(row) | df_index | wime_personal | wime_komfort | wime_sauberkeit | wime_puenktlich | wime_platzangebot | wime_gesamtzuf | wime_preis_leistung | wime_fahrplan | wime_oes_fahrt
226527 | 880 | 100.0 | 77.777778 | 55.555556 | 100.000000 | 100.000000 | 77.777778 | 77.777778 | 100.000000 | 100.000000
226528 | 881 | NaN | 100.000000 | 77.777778 | 100.000000 | 77.777778 | 88.888889 | 33.333333 | 44.444444 | 100.000000
226529 | 882 | NaN | 77.777778 | 66.666667 | 100.000000 | 66.666667 | 77.777778 | 88.888889 | 77.777778 | 77.777778
226530 | 883 | 100.0 | 88.888889 | 77.777778 | 100.000000 | 88.888889 | 88.888889 | 88.888889 | 88.888889 | 88.888889
226531 | 884 | NaN | 66.666667 | 100.000000 | 100.000000 | 100.000000 | 88.888889 | 66.666667 | 100.000000 | 100.000000
226532 | 885 | NaN | 100.000000 | 100.000000 | 100.000000 | 100.000000 | 100.000000 | 100.000000 | 100.000000 | 100.000000
226533 | 886 | 100.0 | 22.222222 | 55.555556 | 11.111111 | 0.000000 | 33.333333 | 0.000000 | 22.222222 | 44.444444
226534 | 887 | NaN | 77.777778 | 55.555556 | 100.000000 | 100.000000 | 77.777778 | 0.000000 | 100.000000 | 100.000000
226535 | 888 | NaN | 66.666667 | 77.777778 | 100.000000 | 66.666667 | 77.777778 | 55.555556 | 55.555556 | 66.666667
226536 | 913 | NaN | 100.000000 | 100.000000 | 88.888889 | 100.000000 | 88.888889 | 100.000000 | 100.000000 | 100.000000